REMARKS

I. INTRODUCTION

In response to the Office Action dated March 4, 2004, no claims have been canceled,
amended or added. Claims 1-57 remain in the application. Entry of the following remarks, and
reconsideration of the application, is requested.

II. PRIOR ART REJECTIONS

A. The Office Action Rejections

In paragraphs (2)-(3) of the Office Action, claims 1, 20, and 39 were rejected under 35
U.S.C. §103(a) as being unpatentable over Bayer et al., U.S. Patent No. 5,202,987 (Bayer) in view of
Tsuchida et al., U.S. Patent No. 6,026,394 (Tsuchida). In paragraph (5) of the Office Action, claims
1, 20, and 39 were rejected under 35 U.S.C. §103(a) as being unpatentable over Bhattacharya et al.,
U.S. Patent No. 5,797,000 (Bhattacharya). In paragraph (7) of the Office Action, claims 2-3, 21-22,
and 40-41 were rejected under 35 U.S.C. §103(a) as being unpatentable over Bhattacharya as applied
to claims 1, 20, and 39, in view of Hintz et al., U.S. Patent No. 5,222,235 (Hintz). In paragraph (11)
of the Office Action, claims 4-6, 23-25, and 42-44 were rejected under 35 U.S.C. §103(a) as being
unpatentable over Bhattacharya as applied to claims 1, 20, and 39 in view of Bordonaro et al., U.S.
Patent No. 5,307,485 (Bordonaro). In paragraph (16) of the Office Action, claims 7-11, 26-30, and
45-49 were rejected under 35 U.S.C. §103(a) as being unpatentable over Bhattacharya as applied to
claims 1, 20, and 39 in view of an "Office Notice" (ON). However, in paragraph (18) of the Office
Action, claims 12-19, 31-38, and 50-57 were indicated as being allowable if rewritten in independent
form to include the base claim and any intervening claims.

Applicants' attorney acknowledges the indication of allowable claims, but respectfully
traverses these rejections.

B. Applicants' Independent Claims

Applicants' independent claims 1, 20 and 39 are directed to loading data into a data store
connected to a computer. Independent claim 1 is representative and comprises the steps of: 
identifying memory constraints; 
identifying processing capabilities; and 

determining a number of load and sort processes to be started in parallel based on the identified
memory constraints and processing capabilities. 
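
For illustration only, the three steps recited above can be sketched in Python. This sketch is not taken from the application or from the Office Action: the function name, the per-process memory figure, and the rule of taking the smaller of a memory-bound count and a processor-bound count are all assumptions, and the claim itself recites no particular formula.

```python
def determine_parallel_processes(available_memory_bytes,
                                 memory_per_process_bytes,
                                 processor_count):
    """Hypothetical sizing rule for the number of load and sort processes.

    available_memory_bytes   -- the identified memory constraint (assumed input)
    memory_per_process_bytes -- assumed working memory of one load/sort process
    processor_count          -- the identified processing capability (assumed input)
    """
    if memory_per_process_bytes <= 0 or processor_count <= 0:
        raise ValueError("per-process memory and processor count must be positive")
    # How many processes fit in memory at once.
    memory_bound = available_memory_bytes // memory_per_process_bytes
    # Start no more processes than memory or processors allow, but at least one.
    return max(1, min(memory_bound, processor_count))
```

Under these assumptions, a system with ample memory is limited by its processor count, and a memory-starved system is limited by how many processes fit in memory.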

C. The Bayer Reference

Bayer describes a high flow-rate synchronizer/scheduler apparatus for a multiprocessor
system during program run-time, which comprises a connection matrix for monitoring and detecting
computational tasks which are allowed for execution containing a task map and a network of nodes
for distributing to the processors information or computational tasks detected to be enabled by the
connection matrix. The network of nodes possesses the capability of decomposing information on a
pack of allocated computational tasks into messages of finer sub-packs to be sent toward the
processors, as well as the capability of unifying packs of information on termination of
computational tasks into a more comprehensive pack. A method of performing the
synchronization/scheduling in a multiprocessor system of this apparatus is also described.

D. The Tsuchida Reference 

Tsuchida describes a database management system for executing database operations in 
parallel by a plurality of nodes and a query processing method. The database management system
contains a decision management node for deciding a distribution node for retrieving information so
as to analyze a query received from an application program, generate a processing procedure for
processing the query, and execute the process, and a join node for sorting, merging, and joining the
information retrieved by the distribution node. When the query process is executed, the distribution
node decided by the decision management node retrieves the information to be processed and the
join node decided by the decision management node also obtains the result for the query from the
retrieved information. The query result is outputted from an output node and transferred to the
application program.

E. The Bhattacharya Reference

Bhattacharya describes a method of performing a parallel join operation on a pair of
relations R1 and R2 in a system containing P processors organized into Q clusters of P/Q
processors each. The system contains disk storage for each cluster, shared by the processors of that
cluster, together with a shared intermediate memory (SIM) accessible by all processors. The relations
R1 and R2 to be joined are first sorted on the join column. The underlying domain of the join
column is then partitioned into P ranges of equal size. Each range is further divided into M
subranges of progressively decreasing size to create MP tasks T.sub.m,p, the subranges of a given
range being so sized relative to one another that the estimated completion time for task T.sub.m,p is
a predetermined fraction of that of task T.sub.m-1,p. Tasks T.sub.m,p with larger time estimates are
assigned (and the corresponding tuples shipped) to the cluster to which processor p belongs, while
tasks with smaller time estimates are assigned to the SIM, which is regarded as a universal cluster
(cluster 0). The actual task-to-processor assignments are determined dynamically during the join
phase in accordance with the dynamic longest processing time first (DLPT) algorithm. Each
processor within a cluster picks its next task at any given decision point to be the one with the
largest time estimate which is owned by that cluster or by cluster 0.
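
The selection rule summarized above (each processor, at each decision point, takes the remaining task with the largest time estimate owned by its own cluster or by cluster 0) can be sketched in Python as follows. This is a minimal illustration, not Bhattacharya's implementation: the function name, the data layout, and the simplification that each task runs for exactly its estimated time are assumptions.

```python
import heapq


def dlpt_schedule(tasks, processor_cluster):
    """Sketch of dynamic longest-processing-time-first task selection.

    tasks             -- list of (task_id, owner_cluster, time_estimate);
                         owner_cluster 0 stands for the universal cluster
    processor_cluster -- dict mapping processor id to its cluster number
    Returns a dict mapping processor id to the task ids it picked, in order.
    """
    remaining = {tid: (owner, est) for tid, owner, est in tasks}
    # Priority queue of (time at which the processor becomes free, processor id).
    free_at = [(0.0, p) for p in sorted(processor_cluster)]
    heapq.heapify(free_at)
    picked = {p: [] for p in processor_cluster}

    while remaining and free_at:
        now, proc = heapq.heappop(free_at)
        cluster = processor_cluster[proc]
        # Only tasks owned by the processor's cluster or by cluster 0 are eligible.
        eligible = [(est, tid) for tid, (owner, est) in remaining.items()
                    if owner in (cluster, 0)]
        if not eligible:
            continue  # nothing this processor may run; it drops out
        est, tid = max(eligible)  # largest time estimate first
        del remaining[tid]
        picked[proc].append(tid)
        # Assumption: actual run time equals the estimate.
        heapq.heappush(free_at, (now + est, proc))
    return picked
```

For example, with tasks `a` (cluster 1, estimate 5), `b` (cluster 1, estimate 3), `c` (cluster 0, estimate 2), and `d` (cluster 2, estimate 4), and one processor in each of clusters 1 and 2, the cluster-2 processor finishes `d` first and picks up the universal-cluster task `c`.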

F. The Hintz Reference 

Hintz describes a reorganization method of DB2 data files exploring parallel processing, and
asynchronous I/O to a great extent. It includes means to estimate an optimum configuration of
system resources, such as storage devices (DASD devices), memory, and CPUs, etc., during
reorganizations. The method mainly consists of four components, (1) concurrent indexing, (2)
concurrent unloading of data file partitions, (3) efficient reloading of DB2 data pages and DB2 space
maps, and (4) means to reduce access constraints to the DB2 recovery table.

G. The Bordonaro Reference

Bordonaro describes a system and method for merging a plurality of sorted lists using
multiple processors having access to a common memory in which N sorted lists which may exceed
the capacity of the common memory are merged in a parallel environment. Sorted lists from a
storage device are loaded into common memory and are divided into a number of tasks equal to the
number of available processors. The records assigned to each task are separately sorted, and used to
form a single sorted list. A multi-processing environment takes advantage of its organization during
the creation of the tasks, as well as during the actual sorting of the tasks.

H. Applicants' Claimed Invention Is Patentable Over The References

Applicants' attorney respectfully submits that Applicants' claimed invention is patentable
over the references. Specifically, Applicants' attorney asserts that the references do not teach or
suggest the limitations recited in Applicants' independent claims 1, 20 and 39.

With regard to the rejections based on Bayer and Tsuchida, the Office Action states the
following:

3. Claims 1, 20, and 39 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Bayer et al. (US Pat 5,202,987, hereinafter Bayer) in view of
Tsuchida et al. (US Pat 6,026,394, hereinafter Tsuchida).

4. Regarding claims 1, 20, and 39, Bayer et al. disclose a method of
loading data into a data store connected to a computer, the method comprising the
steps of:

identifying memory constraints (col. 1, lines 13 - 15, memory and processors
are operations bottleneck and col. 5, lines 52 - 56, memory is constrained or limited
through factors such as physical shared storage, network access or processor
distribution, and common memory space);

identifying processing capabilities (col. 1, lines 17 - 27, synchronization
activities are controlled by algorithm, which depends on processing power and col. 5,
lines 24 - 31 requires the number of processors and capabilities of each processor,
which entail processing capabilities); and

determining a number of load (col. 14, lines 25 - 33, loading capacity being
part of task map) to be started in parallel based on the identified memory constraints
and processing capabilities (col. 7, lines 9 - 11).

Although Bayer disclose the sort process being a mere tasks allocation to the
processors (col. 1, lines 44 - 50), Tsuchida has nevertheless further detailed the sort
feature, which includes the step of determining a number of sort processes (col. 8,
lines 50 - 51 disclose the fact that the sorting process depends on the number of
node for join process. Col. 7, lines 54 - 57 show that the number of join nodes for
performing merge process can be determined. Hence, number of sort processes is a
known quantity).

It is considered obvious to one of ordinary skill in the art, at the time the
invention was made, to combine the sorting feature shown by Tsuchida to the
invention of Bayer so that sort processing time, which is a factor in load balancing
processes can be determined as part of system characteristics and optimization
purposes (Tsuchida: col. 7, line 58 - col. 8, line 35). Note that the sort steps shown
by Tsuchida are also parallel processes as claimed in the application (fig. 3, parallel
pipeline operation).

The Office Action also states the following:

Response to Arguments

19. Applicant's arguments filed 12/30/03 have been fully considered but
they are not persuasive for the reasons set forth below.

20. In response to applicants' remarks, page 20, lines 1-3, and page 22,
4th paragraph, regarding Bayer, Tsuchida and Bhattacharya references do not teach
or suggest the limitation "determining a number of load and sort processes to be
started parallel based on the identified memory constraints and processing
capabilities", this has already been addressed as stated in the rejection above.
Additionally, Tsuchida's fig. 2 teaches a plurality of processors implementing
parallel operations in a database management system. Fig 3 teaches management
node 12 determines the distribution process with the retrieval data (loading process)
by execution on the basis of number of load and sort process (col. 10, line 60 - col.
11, line 16). Tsuchida also teaches of dynamic optimization process to produce an
optimal result within the system processing capabilities by analyzing the processing
procedures that are applicable based on the constraints (fig. 11f, 12b) including
calculating the processing time considering each system characteristic (col. 3, lines 6 -
47, col. 12, lines 9 - 35).

Applicants' attorney disagrees. The cited portions of these references do not teach or
suggest the limitation "determining a number of load and sort processes to be started in parallel based
on the identified memory constraints and processing capabilities."

For example, the cited portions are set forth below. 

Bayer: Col. 1, lines 13-27

The coordination of multiple operations in shared memory multiprocessors
often constitutes a substantial performance bottleneck. Process synchronization and
scheduling are generally performed by software, and managed via shared memory.
Execution of parallel programs on a shared-memory, speedup-oriented
multiprocessor necessitates a means for synchronizing the activities of the individual
processors. This necessity arises due to precedence constraints within algorithms:
When one computation is dependent upon the result of other computations, it must
not commence before they finish. In the general case, such constraints are projected
onto an algorithm's parallel decomposition, and reflected as precedence relations
among its execution threads.

Bayer: Col. 5, lines 24-31 (actually, 24-44)

In addition to a task map, the synchronizer/scheduler is supplied with the
system configuration data. This includes such details as the number of processors,
the capabilities of each processor (if processors are not a-priori identical), etc.

Given a set of enabled tasks, as well as processor availability data, the
synchronizer/scheduler then performs scheduling of those tasks. Any non-random
scheduling policy must rely upon some heuristics: Even when task execution times
are known in advance, finding an optimal schedule for a program represented as a
dependency graph is an NP-complete problem. Most scheduling heuristics are based
on the critical path method, and thereby belong to the class of list scheduling
policies; i.e., policies that rely on a list of fixed task priorities. List scheduling can be
supported by the inventive scheme described herein, by embedding task priorities in
the task map load-module submitted to the synchronizer/scheduler. Whenever an
allocation takes place, the allocated tasks are those which have highest priorities
amongst the current selection of enabled tasks.

Bayer: Col. 14, lines 25-33

Characterizing Parameters

The parameters characterizing a specific synchronizer/scheduler can now be
summarized:

Loading Capacity:
The maximal size of a task map which can be loaded. This parameter is
expressed in terms of quantity of tasks, and/or in terms of quantity of dependency
connections.

Bayer: Col. 7, lines 9-11 (actually, 9-13)

The multiprocessor architecture is illustrated in FIG. 1. As can be seen, the
parallel operation coordination subsystem (synchronizer/scheduler 10) forms an
appendage to a conventional configuration of a shared-memory 12 and processors
14.

Tsuchida: Col. 8, lines 50-51 (actually, lines 44 et seq.)

FIG. 8 is a schematic view of the tuning by the slot sort preprocessing. In the
same way as in FIG. 7, it is assumed that the data retrieval/distribution process is
executed in the nodes #1 to #8 and the processing time in each node is the one
shown at each of the numbers 300 to 305. The processing time in each node varies
with the number of data in each table. The slot sorting process is set so as to be
executed by the nodes for join process. When the processing time varies with each
node, the processing procedure for transferring the slot sorting process to the nodes
for data retrieval/distribution is considered. For example, in a node where the data
retrieval/distribution process is expected to end earlier as slot sort preprocessing, the
slot sorting process is executed as shown at 306 to 309. By performing the slot sort
preprocessing in this manner, the slot sort processing time by the nodes for join
process can be reduced to about the value shown at 312. Using the reduced
processing time shown at 311, the N-way merge process is transferred. This is
nothing but extension of the run length of the slot sorting process. By doing this, the
time 320 required for the N-way merge process can be reduced and as a result, the
total response time can be reduced.

Tsuchida: Col. 7, line 54 - col. 8, line 35

Next, the method for deciding the number of join nodes for performing the
N-way merge process will be explained with reference to FIG. 7. FIG. 7 is a
schematic view for explaining the decision method for the number of join nodes. In
FIG. 7, graphs of the phases of parallel join process explained in FIG. 3 and of the
processing time of each process are made and laid out according to the parallel
pipeline operation explained in FIG. 4. In FIG. 7, it is assumed that the data
retrieval/distribution process is executed in the nodes #1 to #8 and the processing
time in each node is the one shown at each of the numbers 300 to 305. In this
example, the processing time 304 in the node #5 is the maximum processing time.
The slot sorting processing time can be driven from the number of nodes for join
process N, predetermined system characteristics (CPU performance, disk unit
performance, etc.), and database operation method. The performance characteristic
(processing time Es) of the slot sorting process can be obtained generally from the
following expression.

Es=a/N+b*N+c (1)

The N-way merge processing time (Em) and join processing time (Ej) also
can be obtained from the following expressions.

Em=d/N+e*N+f (2)
Ej=g/N+h*N+i (3)

where, symbols a, d, and g indicate constants which are decided from system
characteristics such as the number of rows, the number of pages, each operation unit
time, and output time. Symbols b, e, and h are constants which are decided from
system characteristics such as the communication time, and c, f, and i are constants
which are decided from the other system characteristics.

According to this embodiment, to maximize the effect of the pipeline
process, the number of nodes for join process is obtained as the number of assigned
join nodes 350 so that the performance characteristic Es of the slot sorting process
becomes equal to the maximum processing time 304. When the number of assigned
join nodes 350 is determined, the N-way merge processing time 320 and join
processing time 330 can be estimated from the equations (2) and (3). The total of
these processing times is the total processing time for a query. By deciding the
number of join nodes in this manner and merging the data distributed in the data
retrieval/distribution process successively and processing them simultaneously, the
total processing time (response time from querying to output) can be shortened.
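
The node-count decision in the Tsuchida passage quoted above can be illustrated with a short Python sketch. It evaluates the common form of expressions (1) to (3) and picks the integer N whose estimated slot sorting time Es is closest to the maximum retrieval/distribution time. The function names, the sample constants, and the closest-integer rule are assumptions made for illustration; Tsuchida states only that N is obtained so that Es becomes equal to the maximum processing time.

```python
def estimate_time(n, inverse_coeff, linear_coeff, constant):
    """Common form of expressions (1)-(3): coeff/N + coeff*N + constant."""
    return inverse_coeff / n + linear_coeff * n + constant


def choose_join_nodes(a, b, c, max_retrieval_time, max_nodes):
    """Pick N in 1..max_nodes whose Es = a/N + b*N + c is closest to the
    maximum retrieval/distribution time (hypothetical selection rule;
    ties resolve to the smallest N)."""
    return min(range(1, max_nodes + 1),
               key=lambda n: abs(estimate_time(n, a, b, c) - max_retrieval_time))


def total_query_time(n, max_retrieval_time, merge_coeffs, join_coeffs):
    """Sum of the processing times once N is fixed, with Em and Ej
    estimated from expressions (2) and (3)."""
    em = estimate_time(n, *merge_coeffs)
    ej = estimate_time(n, *join_coeffs)
    return max_retrieval_time + em + ej
```

With the hypothetical constants a=100, b=1, c=0 and a maximum retrieval time of 29, the sketch selects N=4, because Es(4) = 100/4 + 4 = 29.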

Tsuchida: Col. 10, line 60 - col. 11, line 49

FIG. 11(e) shows a detailed flow chart of the process for generation of
processing procedure candidates (Step 2213). The process for generation of
processing procedure candidates checks whether the table to be accessed for the
query is separately stored in a plurality of nodes (Step 22130). When the table is
separately stored in a plurality of nodes, the database management system goes to
Step 22135. When the table is not separately stored, the process for generation of
processing procedure candidates checks whether the sorting process is necessary for
executing the query (Step 22131). When the sorting process is necessary for the
query process, the database management system goes to Step 22135. When the
sorting process is not necessary for the processing procedure candidates, the process
for generation of processing procedure candidates checks whether there is only one
access path for the table to be accessed for the query (Step 22132). When there is
only one access path, the process for generation of processing procedure candidates
generates a single processing procedure corresponding to the access path and ends
the processing (Step 22133). When there is not only one access path, the process for
generation of processing procedure candidates generates a plurality of processing
procedures corresponding to the access paths and ends the processing (Step 22134).
At Step 22135, the process for generation of processing procedure candidates
decomposes the query to two-way joins which are joinable. Next, the process for
generation of processing procedure candidates generates processing procedure
candidates for data read on the basis of the registered access path candidates and
processing procedure candidates for data distribution according to the
decomposition result at Step 22135 in correspondence with the storing nodes where
the table is separately stored. The process for generation of processing procedure
candidates also generates processing procedure candidates for slot sorting when the
slot sorting process is to be executed in the storing nodes. The process for
generation of processing procedure candidates registers the processing procedure
consisting of a combination of these processing procedure candidates as a processing
procedure candidate in each distribution node (Step 22136). The process for
generation of processing procedure candidates registers the processing procedure
consisting of a combination of the slot sorting process procedure, N-way merge
processing procedure, and join processing procedure as a processing procedure
candidate in each join node in correspondence with each join processing node. Then,
the process for generation of processing procedure candidates parameterizes the slot
sorting run length and the number of merging times (Step 22137). The process for
generation of processing procedure candidates registers the requested data output
processing procedure to the requested data output node as a processing procedure
candidate in the output node (Step 22138). Finally, the process for generation of
processing procedure candidates ends the processing when the decomposition
results are all evaluated and repeats Step 22135 and the subsequent steps when any
decomposition results are not evaluated (Step 22139).

Tsuchida: Col. 3, lines 6-47 

Furthermore, the plurality of nodes include at least one decision management
node having an analysis means of receiving a query, analyzing the query, and
generating the query processing procedure, a decision means of deciding the
distribution nodes and join nodes for performing the execution process on the basis
of the query analysis result of the above analysis means, and an output means of
outputting the result for the query obtained from the join node. The decision means
of the decision management node desirably decides the distribution node on the
basis of the query analysis result of the analysis means, calculates the expected
processing time in the distribution node, and decides the join node on the basis of
this processing time.

The decision means distributes retrieval information equally to each join
node on the basis of the expected retrieval information amount in the decided
distribution node. Each of the distribution nodes decided by the decision means
retrieves information from the storage means on the basis of the query analysis result
and distributes the information to another node. The join node inputs information
distributed from the distribution node one by one and processes each inputted
information. The distribution node and join node process information
independently. Each of the join nodes sorts information distributed from the
distribution node, merges the sorted information when it consists of a plurality of
information types, joins a query on the basis of the merged information, and outputs
the result for the query obtained from the join node.

To assign retrieval information equally to the join nodes by the decision
means in a more desirable form, the decision management node has a storage means
of storing column value frequency information relating to the information of the
storage means of each node.

According to the query processing method of the present invention, the
number of nodes can be decided in correspondence with the database operation
which is executed in each node. When there is a scattering in distribution of data, the
data is equally distributed to each node, and each database operation to be executed
in each node is parameterized, and the expected processing times are equalized.
Therefore, the processing time in each node is not biased and the pipeline operation
can be performed smoothly.

Tsuchida: Col. 12, lines 9-35 (actually, col. 12, line 9 - col. 13, line 10)

FIG. 12(b) is a flow chart showing the detailed procedure of the process for
dynamic optimization (Step 223). The process for dynamic optimization checks
whether there is only one processing procedure generated by the process for query.
When there is only one processing procedure, there is no need to execute the
process for dynamic optimization and the database management system goes to the
process for code interpretation execution without doing anything (Step 22300).
When a plurality of processing procedures are generated by the process for query
analysis, the process for dynamic optimization calculates the predicate selectivity
based upon the substituted constant (Step 22301). Then, the process for dynamic
optimization checks whether processing procedure candidates which are executed in
parallel by a plurality of nodes are contained (Step 22302). When no corresponding
processing procedure is contained, the process for dynamic optimization selects the
processing procedure according to the threshold for access path selection and ends
the processing (Step 22313). When a plurality of processing procedures which are
executed in parallel are contained, the process for dynamic optimization inputs the
column value frequency information (the join column value frequency information,
the number of rows and the number of pages in the table which are to be accessed,
etc.) from the dictionary (Step 22303) and calculates the processing time for data
retrieval/distribution as mentioned above by considering each system characteristic
(Step 22304). Then, the process for dynamic optimization decides the number "p" of
nodes to be assigned to the join process from the processing time calculated at Step
22304 and selects the processing procedure "a1" for realizing the process explained
in FIG. 7 from the processing procedure candidates (Step 22305). Next, the process
for dynamic optimization checks whether there is a scattering in the data
retrieval/distribution processing time in the data retrieval/distribution nodes (Step
22306). When there is a scattering in the data retrieval/distribution processing time,
the process for dynamic optimization selects the processing procedure "a2" for
executing the slot sorting process by nodes which can afford to execute the data
retrieval/distribution process among the data retrieval/distribution nodes, that is, for
realizing the process explained in FIG. 8 (Step 22307). The process for dynamic
optimization increases the number "p" of assigned join nodes as much as "alpha"
and selects the processing procedure "a3" for realizing the process explained in FIG.
9 (Step 22308). Furthermore, the process for dynamic optimization compares the
requested data output processing time with the sum of the join processing time and
the last round of N-way merge processing time and when the former is greater than
the latter (Step 22309), selects the processing procedure "a4" for realizing the
process in which the last round of N-way merge process is transferred to the join
process as explained in FIG. 10 (Step 22310). In consideration of the response time,
the load of each node, and the effect on the response performance of other
transactions, the process for dynamic optimization selects the best suited processing
procedure among the processing procedures "a1" to "a4" which are set above (Step
22311). After the processing procedure is selected, the process for dynamic
optimization generates the data distribution information to be used for the data
distribution process on the basis of the column value frequency information (Step
22312). When there is no column value frequency information, the process for
dynamic optimization generates the data distribution information according to the
join column evaluation value of the hash function. Finally, the process for dynamic
optimization decides the processing procedure which is executed finally according to
the threshold for access path selection and the process for dynamic optimization
ends (Step 22313).

The descriptions set forth above do not teach or suggest the limitation "determining a
number of load and sort processes to be started in parallel based on the identified memory constraints
and processing capabilities."

Instead, Bayer is directed to the scheduling of synchronized tasks, but tasks are assigned to
processors based on precedence and availability, wherein one computation is dependent on another,
and wherein a processor is allocated a new task immediately after it terminates the previous one.
Moreover, the "Loading Capacity" referred to above at col. 14, lines 25-33 of Bayer, a
"Characterizing Parameter," relates to the maximal size of the task map, wherein the task map is a
data structure that identifies dependencies between tasks being performed, and is used as indicated
to assign tasks to processors based on availability.

Similarly, Tsuchida is directed to a parallel processing database management system, wherein
the number of nodes for a join process is obtained so that a performance characteristic becomes
equal to the maximum processing time, i.e., a decision means determines which (already started)
nodes are to be used to perform the query in order to minimize the expected processing time.

As a result, neither the Bayer nor the Tsuchida reference teaches or suggests "determining a
number of load and sort processes to be started in parallel based on the identified memory constraints
and processing capabilities." Consequently, it cannot be said that the combination of Bayer and
Tsuchida teaches or suggests, or renders obvious, the Applicants' independent claims.

With regard to the rejections based on Bhattacharya, the Office Action states the following: 

5. Claims 1, 20, and 39 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Bhattacharya et al. (US Pat 5,797,000, hereinafter Bhattacharya).

6. As per claims 1, 20, and 39, Bhattacharya et al. disclose a method of
loading data into a data store connected to a computer, the method comprising the
steps of:

identifying memory constraints (col. 9, lines 6-7, memory becomes a
constraint as its capacity is a contributing factor and is limited);

identifying processing capabilities (fig. 1, number of processors p, col. 4, line
41 - col. 5, line 8, each processor is assigned with a specific number of tasks, hence
indicating each limited capability); and

determining a number of load (col. 3, lines 1-3, join column domain and
tuples are the load, which obviously much be known in order for them to be
partitioned and transferred among the cluster, col. 3, lines 7 - 18), and sort processes
(col. 2, line 62 - col. 3, line 6 disclose various method of parallel sort process in
which merge join is one example. Since the actual tasks assigned to the processors
are determined during the join phase, which is part of the sort process as shown
above, number of sort processes are hence inherently determined as well) to be
started in parallel based on the identified memory constraints and processing
capabilities (col. 1, lines 24 - 28, col. 4, line 64 - col. 5, line 8: parallel tasks based on
processing capabilities, col. 2, line 66 - col. 3, line 6: parallel sort processing).

Applicants' attorney disagrees. The cited portions of the reference do not teach or suggest
the limitations "determining a number of load and sort processes to be started in parallel based on the
identified memory constraints and processing capabilities."

For example, the cited portions are set forth below:

Bhattacharya: Col. 9, lines 6-7 (actually, 6-17)

Each processor 104 of the system 100 is allotted an equal portion 1/P of the
memory capacity of universal cluster 108. In the initial portion of the transfer phase,
for each processor p (104) of the system 100, the tasks T.sub.m,p corresponding to
that processor and residing on a particular cluster 102 are transferred from that
cluster to the universal cluster 108, beginning with the task T.sub.M,p having the
smallest estimated completion time and progressing in order of increasing task size
(i.e., decreasing m), until the allotted portion 1/P is filled (step 532). The remaining
tasks T.sub.m,p for each processor p (104) are transferred to the cluster 102 owning
the processor, unless they are already resident there (step 534).

Bhattacharya: Col. 4, line 41 - col. 5, line 8 

Referring to FIG. 1, a multiprocessor system 100 incorporating the present 
invention includes P processors 104 organized into Q equal-size clusters 102, each 
cluster containing P/Q processors. Each processor 104 may be either a uniprocessor 
or a complex of tightly coupled processors (not separately shown) that, for the 
purposes of task assignment, are regarded as a single processor. Each cluster 102 also 
includes one or more direct access storage devices (DASD) 106, which are magnetic 
disk drives in the system 100 shown. Each processor 104 within a cluster 102 can 
access any storage device 106 in the same cluster, but cannot access any storage 
device in any other cluster 102. Processors 104 are interconnected to one another as 
well as to a single intermediate memory (SIM) 108, to which each processor has 
access. SIM 108 is also referred to herein as the universal cluster, or cluster 0. In 
addition to the memory 108 and storage devices 106 shown, each processor 104 also 
has its own main memory (not separately shown). In the case of a processor 104 
comprising a tightly coupled processor complex, such main memory would be 
shared by the processors of the complex. The elements shown in FIG. 1 are 
conventional in the art, as are the interconnections between these elements. 

Processors 104 are used for the concurrent parallel execution of tasks making 
up database queries, as described below. A query may originate either from one of 
the processors 104 or from a separate front-end query processor as described in the 
concurrently filed application of T. Borden et al., Ser. No. 08/148,091, now U.S. Pat. 
No. 5,495,606. As further described in that application, within each cluster 102 the 
query splitting and scheduling steps described below may be performed by an 
additional processor or processors (not shown) similar to processors 104; such 
additional processors would not be counted among the P/Q processors 104 per 
complex 102 to which tasks are assigned. 

Bhattacharya: Col. 3, lines 1-3 (actually col. 2, line 66 - col. 3, line 6) 

In a parallel sort merge join, the relations to be joined are first sorted, in 
parallel, within their clusters 102 (FIG. 1). In a naive parallel sort merge join, the 
underlying join column domain might be partitioned into P ranges of equal size, and 
the tuples transferred accordingly among the clusters 102. However, given a 
nonuniform distribution of tuples across the underlying domain, there is no 
guarantee that the amount of join phase work will be equal 

Bhattacharya: Col. 3, lines 7-18 (actually lines 7-24) 
In accordance with the present invention, each of the P ranges is further 
divided into a relatively small number M of components, creating MP tasks 
T.sub.m,p in all. These components intentionally have nonequal task time estimates. 
For example, a reasonable approach would be to partition the tasks so that the 
estimated completion time of a task T.sub.m,p is half that of the previous task 
T.sub.m-1,p. Assuming that the quadratic output term dominates the task time 
estimates, this can be done by partitioning the tasks in such a manner that the extent 
of the range of a given task T.sub.m,p (to which the number of tuples in the task is 
roughly proportional) is 1/.sqroot.2 times the number of tuples in task T.sub.m-1,p. 
FIGS. 10A and 10B show an example of such a partitioning. FIG. 10A shows 
estimated task times as a function of m and p, and FIG. 10B shows actual task times, 
also as a function of m and p. The latter may be different from the former, and will 
not be known until the join phase, when the tasks are actually performed. 

Bhattacharya: Col. 1, lines 24-28 

This invention relates generally to a method of performing a parallel query in 
a multiprocessor environment and, more particularly, to a method for performing 
such a query with load balancing in an environment with shared disk clusters, shared 
intermediate memory or both. 

The descriptions set forth above do not teach or suggest the limitation "determining a 
number of load and sort processes to be started in parallel based on the identified memory constraints 
and processing capabilities." 

Instead, Bhattacharya is directed to a parallel join operation, wherein tasks are assigned to 
processors based on the partitioning of the domain of the join column into P ranges based on the 
number of processors and the partitioning of each range into M subranges based on the estimated 
completion time for the task. 

As a result, Bhattacharya does not teach or suggest the limitations "determining a number of 
load and sort processes to be started in parallel based on the identified memory constraints and 
processing capabilities." Consequently, it cannot be said that Bhattacharya renders the Applicants' 
independent claims obvious. 

Hintz, Bordonaro and ON fail to overcome the deficiencies of Bhattacharya. Recall that 
Hintz was cited only against dependent claims 2-3, 21-22 and 40-41, Bordonaro was cited only 
against dependent claims 4-6, 23-25 and 42-44, and ON was cited only against dependent claims 
7-11, 26-30 and 45-49. Moreover, Hintz was cited only for determining a number of build processes 
based on the number of sort processes, and for teaching that the number of sort processes does not 
exceed a number of indexes to be built, Bordonaro was cited only for teaching that the number of 
load processes does not exceed a number of partitions to be loaded, and that the load and sort 
processes are directly dependent on memory constraints, and ON was cited only for teaching to 
efficiently utilize all processing capabilities required for the desired task. None of these teachings are 
relevant to the limitations of Applicants' independent claims. 

Thus, Applicants submit that independent claims 1, 20 and 39 are allowable over the 
references. Further, dependent claims 2-19, 21-38 and 40-57 are submitted to be allowable over the 
references in the same manner, because they are dependent on independent claims 1, 20 and 39, 
respectively, and thus contain all the limitations of independent claims 1, 20 and 39. In addition, 
dependent claims 4-9, 11-25 and 27-44 recite additional novel elements not shown by the references. 

III. CONCLUSION 

In view of the above, it is submitted that this application is now in good order for allowance 
and such allowance is respectfully solicited. Should the Examiner believe minor matters still remain 
that can be resolved in a telephone interview, the Examiner is urged to call Applicants' undersigned 
attorney. 

Respectfully submitted, 

GATES & COOPER LLP 
Attorneys for Applicants 

Howard Hughes Center 
6701 Center Drive West, Suite 1050 
Los Angeles, California 90045 
(310) 641-8797 



Date: May 4, 2004 By: 



Name: George . 
Reg. No.: 33,500 

GHG/ 